Features for speaker and language identification
نویسندگان
چکیده
Abstract In this paper we examine several features derived from the speech signal for the purpose of identification of speaker or language from the speech signal. Most of the current systems for speaker and language identification use spectral features from short segments of speech. There are additional features which can be derived from the residual of the speech signal, which correspond to the excitation source of speech signal. These features at the subsegmental (less than a pitch period) level correspond to the glottal vibration in each cycle, and at the suprasegmental (several pitch periods) level the features correspond to intonation and duration characteristics of speech. At the subsegmental level features can be extracted from the residual signal and also from the phase of the residual signal. The characteristics of speaker or language can be captured from the spectral or subsegmental features using Autoassociative Neural Network (AANN) models. We demonstrate that these features indeed contain speaker-specific and language-specific information. Since these features are more or less from independent sources, it is likely that they provide complementary information, which when combined suitably will increase the effectiveness of speaker and language identification systems.
منابع مشابه
Speaker vectors from subspace Gaussian mixture model as complementary features for language identification
In this paper, we explore new high-level features for language identification. The recently introduced Subspace Gaussian Mixture Models (SGMM) provide an elegant and efficient way for GMM acoustic modelling, with mean supervectors represented in a low-dimensional representative subspace. SGMMs also provide an efficient way of speaker adaptation by means of lowdimensional vectors. In our framewo...
متن کاملPhonetic Speaker Id
This paper describes the exploration of text-independent speaker identification using novel approaches based on speakers’ phonetic features instead of traditional acoustic features. Different phonetic speaker identification approaches are discussed in this paper and evaluated using two speaker identification systems: one multilingual system and one single language multiple-engine system. Furthe...
متن کاملOffline Language-free Writer Identification based on Speeded-up Robust Features
This article proposes offline language-free writer identification based on speeded-up robust features (SURF), goes through training, enrollment, and identification stages. In all stages, an isotropic Box filter is first used to segment the handwritten text image into word regions (WRs). Then, the SURF descriptors (SUDs) of word region and the corresponding scales and orientations (SOs) are extr...
متن کاملSpeaker Identification for Swiss German with Spectral and Rhythm Features
We present results of speech rhythm analysis for automatic speaker identification. We expand previous experiments using similar methods for language identification. Features describing the rhythmic properties of salient changes in signal components are extracted and used in an speaker identification task to determine to which extent they are descriptive of speaker variability. We also test the ...
متن کاملمقایسه روش های طیفی برای شناسایی زبان گفتاری
Identifying spoken language automatically is to identify a language from the speech signal. Language identification systems can be divided into two categories, spectral-based methods and phonetic-based methods. In the former, short-time characteristics of speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2004